Add --link-targets-dir argument to linkchecker #143883
These commits modify the If this was unintentional then you should revert the changes before this PR is merged. |
d51bfa1 to 6831c80
Can you help me understand a little more about how your tool works? Is it generating some intermediate files with relative paths, and then translating them to absolute paths? What do the directory structures look like?
The relnotes-api-list tool generates a JSON file with all stabilized APIs in the standard library, and their documentation URLs. These URLs look like I want to add a step to the tool verifying all those links are correct. To do so, my current implementation generates a temporary HTML file with an In practice, this would be like placing my temporary file in
That's why I decided to add the |
Ah, that makes sense, thanks for the explanation!
6831c80 to 2b12b92
2b12b92 to 2995467
r=ehuss with the Windows issue fixed.
let entry =
    self.cache.entry(pretty_path.clone()).or_insert_with(|| match fs::metadata(file) {
for base in once(&self.root).chain(self.link_targets_dirs.iter()) {
Changing this from a closure to a loop breaks on Windows, because the return value is no longer the correct type. I think something roughly like this should fix it:
--- a/src/tools/linkchecker/main.rs
+++ b/src/tools/linkchecker/main.rs
@@ -439,15 +439,18 @@ fn load_file(&mut self, file: &Path, report: &mut Report) -> (String, &FileEntry
}
}
Err(e) if e.kind() == ErrorKind::NotFound => FileEntry::Missing,
- Err(e) => {
- // If a broken intra-doc link contains `::`, on windows, it will cause `ERROR_INVALID_NAME` rather than `NotFound`.
- // Explicitly check for that so that the broken link can be allowed in `LINKCHECK_EXCEPTIONS`.
- #[cfg(windows)]
+ // If a broken intra-doc link contains `::`, on windows, it
+ // will cause `ERROR_INVALID_NAME` rather than `NotFound`.
+ // Explicitly check for that so that the broken link can be
+ // allowed in `LINKCHECK_EXCEPTIONS`.
+ #[cfg(windows)]
+ Err(e)
if e.raw_os_error() == Some(ERROR_INVALID_NAME)
- && file.as_os_str().to_str().map_or(false, |s| s.contains("::"))
- {
- return FileEntry::Missing;
- }
+ && file.as_os_str().to_str().map_or(false, |s| s.contains("::")) =>
+ {
+ FileEntry::Missing
+ }
+ Err(e) => {
panic!("unexpected read error for {}: {}", file.display(), e);
}
});
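The shape of the suggested fix can be shown in isolation. This is a minimal sketch, not the linkchecker code: `Entry`, `classify`, and the `Unexpected` variant are hypothetical names, and the sketch returns `Unexpected` where the real code panics so that every arm can be exercised. The constant is hard-coded (123 is the Windows value of `ERROR_INVALID_NAME`) so the example compiles on any platform. The point is that a guarded match arm evaluates to a value, so it behaves the same inside a closure or a loop, whereas a `return` statement exits whichever function or closure encloses it.

```rust
use std::io::{self, ErrorKind};

// Hypothetical stand-in for linkchecker's `FileEntry`.
#[derive(Debug, PartialEq)]
enum Entry {
    Found,
    Missing,
    // Assumption for this sketch: the real code panics here instead.
    Unexpected(String),
}

// Hard-coded so the sketch builds everywhere; 123 is the Windows
// value of ERROR_INVALID_NAME.
const ERROR_INVALID_NAME: i32 = 123;

fn classify(result: io::Result<()>, path: &str) -> Entry {
    match result {
        Ok(()) => Entry::Found,
        Err(e) if e.kind() == ErrorKind::NotFound => Entry::Missing,
        // Guarded arm: yields a value, so no `return` is needed and the
        // arm does not depend on what encloses the `match`.
        Err(e) if e.raw_os_error() == Some(ERROR_INVALID_NAME) && path.contains("::") => {
            Entry::Missing
        }
        Err(e) => Entry::Unexpected(e.to_string()),
    }
}

fn main() {
    let invalid_name = io::Error::from_raw_os_error(ERROR_INVALID_NAME);
    println!("{:?}", classify(Err(invalid_name), "std/foo::bar.html"));

    let denied = io::Error::new(ErrorKind::PermissionDenied, "denied");
    println!("{:?}", classify(Err(denied), "std/index.html"));
}
```

In the real diff the guarded arm sits behind `#[cfg(windows)]`, which would explain why the type error only shows up on Windows builds.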
    report.report();
    if report.errors != 0 {
        println!("found some broken links");
        std::process::exit(1);
    }
}

fn parse_cli() -> Result<Cli, String> {
    fn to_canonical_path(arg: &str) -> Result<PathBuf, String> {
        PathBuf::from(arg).canonicalize().map_err(|e| format!("could not canonicalize {arg}: {e}"))
For the record, I'm always slightly uneasy with ever using canonicalize. I don't think there is anything to change here since it seems to be working. I just wanted to note the risk here.
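The comment does not say which risk is meant. As one illustration, `Path::canonicalize` touches the filesystem: it resolves symlinks and returns an error if the path does not exist, and on Windows it returns extended-length (`\\?\`) paths. The sketch below reuses the helper from the snippet above as a standalone function; the example paths are made up.

```rust
use std::path::PathBuf;

// Same shape as the helper under review, lifted out so it can run alone.
fn to_canonical_path(arg: &str) -> Result<PathBuf, String> {
    PathBuf::from(arg).canonicalize().map_err(|e| format!("could not canonicalize {arg}: {e}"))
}

fn main() {
    // An existing directory canonicalizes to an absolute path.
    match to_canonical_path(".") {
        Ok(path) => println!("current dir: {}", path.display()),
        Err(err) => println!("{err}"),
    }

    // A path that does not exist is an error rather than a
    // best-effort normalization.
    match to_canonical_path("no/such/dir/for/this/sketch") {
        Ok(path) => println!("unexpectedly resolved: {}", path.display()),
        Err(err) => println!("{err}"),
    }
}
```

For a CLI flag this means a mistyped directory is rejected while arguments are parsed, before any links are checked.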
In my release notes API list tool (#143053) I want to check whether all links generated by the tool are actually valid, and using linkchecker seems to be the most sensible choice.
Linkchecker currently has a fairly big limitation though: it can only check a single directory, it checks all of the files within it, and link targets must point inside that same directory. This works great when checking the whole documentation package, but in my case I only need to check that one file contains valid links to the standard library docs.
To solve that, this PR adds a new --link-targets-dir flag to linkchecker. Directories passed to it will be valid link targets (with lower priority than the root being checked), but links within them will not be checked.
I'm not that happy with the name of the flag, happy for it to be bikeshedded.
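The lookup order described above (root first, then each extra directory) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `resolve_target` is a hypothetical helper, and the directory and file names are invented for the demo. It mirrors the `once(&self.root).chain(self.link_targets_dirs.iter())` iteration quoted in the review.

```rust
use std::fs;
use std::iter::once;
use std::path::{Path, PathBuf};

/// Hypothetical helper: return the first location where `relative` exists,
/// trying `root` before any of the extra link target directories.
fn resolve_target(
    root: &Path,
    link_targets_dirs: &[PathBuf],
    relative: &Path,
) -> Option<PathBuf> {
    once(root)
        .chain(link_targets_dirs.iter().map(PathBuf::as_path))
        .map(|base| base.join(relative))
        .find(|candidate| candidate.exists())
}

fn main() -> std::io::Result<()> {
    // Throwaway layout: one checked root and one extra target directory.
    let tmp = std::env::temp_dir().join("link_targets_dir_sketch");
    let root = tmp.join("root");
    let extra = tmp.join("extra");
    fs::create_dir_all(&root)?;
    fs::create_dir_all(&extra)?;
    fs::write(root.join("both.html"), "")?;
    fs::write(extra.join("both.html"), "")?;
    fs::write(extra.join("only_extra.html"), "")?;

    let extras = vec![extra];
    for name in ["both.html", "only_extra.html", "missing.html"] {
        println!("{name}: {:?}", resolve_target(&root, &extras, Path::new(name)));
    }

    fs::remove_dir_all(&tmp)
}
```

A file present in both places resolves to the copy under the root, which matches the "lower priority than the root being checked" wording. Whether links inside the extra directories are checked is a separate concern that this sketch does not model.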